Online Kernel Learning with a Near Optimal Sparsity Bound

Authors

  • Lijun Zhang
  • Jinfeng Yi
  • Rong Jin
  • Ming Lin
  • Xiaofei He
Abstract

In this work, we focus on Online Sparse Kernel Learning, which aims to learn, in an online fashion, a kernel classifier with a bounded number of support vectors. Although many online learning algorithms have been proposed to learn a sparse kernel classifier, most of them fail to bound the number of support vectors used by the final solution, which is the average of the intermediate kernel classifiers generated by the online algorithm. The key idea of the proposed algorithm is to measure the difficulty of correctly classifying a training example by the derivative of a smooth loss function, and to give a difficult example a higher chance of becoming a support vector than an easy one via a sampling scheme. Our analysis shows that when the loss function is smooth, the proposed algorithm yields a performance guarantee similar to that of the standard online learning algorithm, but with a near-optimal number of support vectors (up to a poly(ln T) factor). Our empirical study shows promising performance of the proposed algorithm compared to state-of-the-art algorithms for online sparse kernel learning.
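To make the sampling scheme concrete, the sketch below shows one way such a difficulty-driven rule could be implemented. It is a minimal illustration under assumed choices, not the authors' algorithm: the names (smooth_hinge_grad, online_sparse_kernel_learning, eta) are hypothetical, and the smoothed hinge loss is only one example of a smooth loss whose derivative can play the role of the "difficulty" of an example.

import numpy as np

def smooth_hinge_grad(margin, gamma=1.0):
    # Magnitude of the derivative of a smoothed hinge loss: 0 for
    # confidently correct examples, growing to 1 as the margin shrinks.
    if margin >= 1.0:
        return 0.0
    if margin <= 1.0 - gamma:
        return 1.0
    return (1.0 - margin) / gamma

def online_sparse_kernel_learning(stream, kernel, eta=0.1, rng=None):
    # Online learner that keeps an example as a support vector only with
    # probability equal to the loss derivative at that example.
    # `stream` yields (x, y) pairs with y in {-1, +1}; `kernel` is k(u, v).
    rng = rng or np.random.default_rng(0)
    support, coeffs = [], []   # retained support vectors and their weights
    for x, y in stream:
        # Current prediction f(x) = sum_i coeffs[i] * k(support[i], x).
        f = sum(a * kernel(s, x) for s, a in zip(support, coeffs))
        p = smooth_hinge_grad(y * f)      # "difficulty" of this example
        if p > 0.0 and rng.random() < p:
            # Dividing by p keeps the expected update equal to the full
            # (unsampled) gradient step, so easy examples are skipped
            # without biasing the learner.
            support.append(x)
            coeffs.append(eta * y / p)
    return support, coeffs

For instance, the learner could be run with kernel=lambda u, v: float(np.exp(-np.dot(u - v, u - v))) over a stream of (feature vector, ±1 label) pairs; examples classified with a comfortable margin have derivative zero and are never stored, which is what keeps the number of support vectors small.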

Similar Resources

Sparsity in machine learning: theory and practice

The thesis explores sparse machine learning algorithms for supervised (classification and regression) and unsupervised (subspace methods) learning. For classification, we review the set covering machine (SCM) and propose new algorithms that directly minimise the SCM's sample compression generalisation error bounds during the training phase. Two of the resulting algorithms are proved to pr...

Supplementary Material: Online Kernel Learning with a Near Optimal Sparsity Bound

A. Proof of Theorem 2. Here we prove a lower bound on the number of support vectors needed to achieve the optimal regret bound. First, we construct a set of n examples T_1 = {(x_i, y_i)}_{i=1}^n, where ⟨κ(x_i, ·), κ(x_j, ·)⟩_{H_κ} = δ_{ij} and y_i ∈ {1, −1}. To make the construction, consider the degree-d polynomial kernel κ(x, y) = (x⊤y)^d and a Euclidean space R^m where m > n. Since m > n, we can find a set of orthonormal ...
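The orthonormality asserted in this construction follows in one line from the reproducing property. The short calculation below spells it out; the degree-d exponent on the kernel is assumed from the phrase "degree-d polynomial kernel", and x_1, ..., x_n are taken to be orthonormal vectors in R^m, which exist because m > n.

% Orthonormality of the feature maps kappa(x_i, .) in H_kappa,
% assuming x_1, ..., x_n are orthonormal in R^m (possible since m > n):
\[
  \langle \kappa(x_i,\cdot),\, \kappa(x_j,\cdot) \rangle_{\mathcal{H}_\kappa}
  = \kappa(x_i, x_j)
  = \bigl(x_i^{\top} x_j\bigr)^{d}
  = \delta_{ij}^{\,d}
  = \delta_{ij}.
\]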

Approximation Vector Machines for Large-scale Online Learning

One of the most challenging problems in kernel online learning is to bound the model size and to promote model sparsity. Sparse models not only improve computation and memory usage, but also enhance the generalization capacity – a principle that concurs with the law of parsimony. However, inappropriate sparsity modeling may also significantly degrade the performance. In this paper, we propose A...

Structured Sparsity and Generalization

We present a data dependent generalization bound for a large class of regularized algorithms which implement structured sparsity constraints. The bound can be applied to standard squared-norm regularization, the Lasso, the group Lasso, some versions of the group Lasso with overlapping groups, multiple kernel learning and other regularization schemes. In all these cases competitive results are o...

Adaptive Control of a Class of Nonlinear Discrete-Time Systems with Online Kernel Learning

An Online Kernel Learning based Adaptive Control (OKL-AC) framework for discrete-time affine nonlinear systems is presented in this paper. A sparsity strategy is proposed to control the complexity of the OKL identification model while making a trade-off between the demanded tracking precision and the complexity of the control law. The forward increasing and backward decreasing learning stages...

Journal title:

Volume   Issue

Pages   -

Publication date: 2013